NUCLEIC ACID ENCODING M. TUBERCULOSIS ALGU PROTEIN

Background of the Invention 

Mycobacteria are gram-positive, nonmotile, rod-shaped bacilli that
do not form spores. The cell wall contains a very high concentration of
lipids complexed to a variety of peptides and polysaccharides. The unusual structure of the
cell wall distinguishes mycobacteria from most other bacteria and is detectable by its
resistance to decolorization with acid-alcohol after staining (acid-fastness).

The disease caused by M. tuberculosis is a progressive, deadly illness that tends
to develop slowly and follows a chronic course (Plorde, 1994). It is presently estimated that
one-third of the world's population is infected with M. tuberculosis, 30 million of whom have
active disease (Plorde, 1994). An additional 8 million people develop the disease annually
(Plorde, 1994). Most infections are caused by inhalation of droplet nuclei carrying the
mycobacterium. A single cough can generate 3000 infected droplet nuclei and even 10 bacilli
may be sufficient to cause a pulmonary infection. In addition to the primary infection,
reactivation of the disease can occur in older people and in immunocompromised patients.

When intracellular pathogens, such as Mycobacterium tuberculosis, are ingested
by macrophages, the bacteria are under environmental stress. The genes required for survival
following uptake by macrophages can provide insight into mycobacterial pathogenesis and
provide novel targets for developing antibacterial agents. The ability to adapt to
intracellular stress requires regulation of complex gene expression, and this regulation may
be mediated in part by one or more alternative sigma factors. Therefore, stress response
alternative sigma factors (sigE family) from M. tuberculosis are potential novel targets for
antibacterial therapeutics.

Extracellular environmental stress can significantly affect the survival of
bacteria. As part of the adaptive response by the bacteria, alternative sigma factors play
a critical role in the coordinate regulation of gene expression. For example, survival following
exposure to extreme temperature in Escherichia coli is regulated by a family of alternative
sigma factors known as the sigE family (Keiichiro et al., Raina et al., Rouviere et al.).
Alginate production in Pseudomonas aeruginosa is also regulated by a sigE family member,
the product of the algU gene (Deretic et al.). Respiratory infections with mucoid
P. aeruginosa are the major cause of mortality in cystic fibrosis (CF) patients. Although
initial colonizing strains are nonmucoid, the bacteria are converted to mucoid
P. aeruginosa in the CF lung. This
conversion to mucoidy is regulated by the alternative sigma factor algU (Martin et al.).

Sigma (σ) factors are positive regulators of general transcription initiation that
enhance transcriptional specificity. The basic unit of the bacterial transcription apparatus
is the DNA-dependent RNA polymerase holoenzyme, a complex consisting of five protein
subunits: two copies of the α subunit and one copy each of the β, β′, and σ subunits. The
α, β and β′ subunits are invariant in a given bacterial species and together form core RNA
polymerase. Open promoter complexes form only when holoenzyme is bound at a promoter
(Gross et al., 1992). When the newly synthesized RNA chain is 8-9 nucleotides long, the σ
factor dissociates from the complex and the elongation process begins (von Hippel et al.,
1992). After transcription is terminated, a σ factor rebinds core polymerase, creating
holoenzyme for another round of initiation (von Hippel et al., 1992). This series of
biochemical activities has been termed "the transcription cycle".

Rifampicin, a highly specific inhibitor of mycobacterial RNA polymerase, is
one of the primary drugs of choice for treatment of tuberculosis. Combination treatment with
isoniazid is typical if there is no risk of developing multi-drug resistance. Prolonged
treatment regimens are necessary and can take up to nine months. Failure to complete the
prolonged treatment course is one of the contributing factors in the development of resistant



WO 98/31789 



PCT/US98/01244 




bacterial strains. Rifabutin is an effective analog of rifampicin, but 70% of
rifampicin-resistant strains are also rifabutin-resistant.

Although RNA polymerase is a well-validated target for anti-mycobacterial
therapy, discovery of inhibitors of M. tuberculosis RNA polymerase is hampered by a lack
of information concerning components of the M. tuberculosis transcriptional apparatus,
difficulties in obtaining sufficient yields of active enzymes for biochemical studies, and
technical and biosafety concerns surrounding the handling of live cultures of M. tuberculosis.
Establishment of an in vitro transcription system employing purified and reconstituted RNA
polymerase would greatly advance efforts to identify new therapeutic agents active against
tuberculosis. It is quite possible that molecules that inhibit σ factor function will not affect
eukaryotic general transcription. Thus, σ factors are a reasonable target for development of
transcriptional inhibitors, and molecules that inhibit σ factor function may be used as
general bacterial transcriptional inhibitors and antibacterial therapeutics.

Accordingly, there is a need in the art for compositions and methods utilizing
cloned genes and purified proteins derived from M. tuberculosis RNA polymerase.

Summary of the Invention 

The present invention is based on the isolation and characterization of DNA
encoding the σ subunit of RNA polymerase derived from the algU gene of M. tuberculosis.
In one aspect, the invention provides a purified, isolated nucleic acid having the sequence
shown in Figure 3. The invention also encompasses sequence-conservative and
function-conservative variants of this sequence. The invention also provides vectors
comprising these sequences, and cells comprising the vectors.

In another aspect, the present invention provides a purified, isolated
polypeptide encoded by the nucleic acid sequence shown in Figure 3, as well as
function-conservative variants thereof.

In yet another aspect, the invention provides in vitro methods for
high-throughput screening to detect inhibitors of M. tuberculosis RNA polymerase. The methods
are carried out by the steps of:

a) providing a mixture comprising

(i) purified M. tuberculosis RNA polymerase containing the
algU σ factor and

(ii) a DNA template encoding a promoter sequence that is
recognized by M. tuberculosis RNA polymerase containing the algU subunit;

b) incubating the mixture in the presence of test compounds to form test 
samples, and in the absence of test compounds to form control samples, under conditions that 
result in RNA synthesis in the control samples; 

c) measuring RNA synthesis in the test and control samples; and 

d) comparing the RNA synthesis detected in step (c) between the test and 
control samples. According to the invention, an RNA polymerase inhibitor is a test 
compound that causes a reduction in RNA synthesis measured in the test sample relative to 
RNA synthesis measured in the control sample. 
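
The comparison in steps (c) and (d) can be sketched as a simple percent-inhibition calculation. The following is a minimal illustration only: the function names, the background correction, the 50% hit threshold and the example counts are all assumptions introduced here and are not taken from the specification.

```python
from statistics import mean


def percent_inhibition(test_signal, control_signals, background=0.0):
    """Reduction in RNA synthesis in a test sample relative to the mean control.

    Signals are any measure of RNA synthesis (for example, incorporated
    radiolabel counts). The background term is a hypothetical no-enzyme blank.
    """
    control = mean(control_signals) - background
    if control <= 0:
        raise ValueError("control samples show no RNA synthesis above background")
    return 100.0 * (1.0 - (test_signal - background) / control)


def is_inhibitor(test_signal, control_signals, threshold=50.0, background=0.0):
    """Flag a test compound whose inhibition meets an assumed hit threshold."""
    return percent_inhibition(test_signal, control_signals, background) >= threshold


# Hypothetical counts from three control wells and two test wells.
controls = [1000.0, 1040.0, 960.0]
for name, signal in [("compound A", 250.0), ("compound B", 900.0)]:
    print(name, round(percent_inhibition(signal, controls), 1), is_inhibitor(signal, controls))
```

In practice the threshold would be chosen from the variability of the control samples rather than fixed in advance.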

In yet another aspect, the invention provides in vivo methods for
high-throughput screening to detect inhibitors of M. tuberculosis RNA polymerase. The methods

are carried out by the steps of: 

a) providing a non-mycobacterial bacterial strain, preferably E. coli, that 

(i) has been transformed with a DNA template encoding a 
promoter sequence that is recognized by M. tuberculosis RNA polymerase containing the 
algU subunit, and 

(ii) expresses enzymatically active M. tuberculosis RNA
polymerase (e.g., α, β, β′ plus the algU σ subunit disclosed herein);

b) incubating the bacterial strain of (a) in the presence of test compounds 
to form test samples, and in the absence of test compounds to form control samples; 

c) measuring RNA synthesis in the test and control samples; and 

d) comparing the RNA synthesis detected in step (c) between the test and 
control samples. According to the invention, an RNA polymerase inhibitor is a test 
compound that causes a reduction in RNA synthesis measured in the test sample relative to 
RNA synthesis measured in the control sample. 

These and other aspects of the present invention will be apparent to those of
ordinary skill in the art in light of the present specification and appended claims.

Description of the Drawings

Figure 1. PCR amplification of M. tuberculosis H37Rv genomic DNA.
Lane M: DNA marker (123 bp, Gibco-BRL); lane 1: primer P1 only; lane 2: primer P2 only;
and lane 3: primers P1 and P2. The amplified DNA fragment (arrow in Figure 1) was gel
purified and subcloned into pCRScript (Stratagene) plasmid.

Figure 2. Southern blot analysis of M. tuberculosis H37Rv DNA and
cosmid clones. 2A. M. tuberculosis H37Rv genomic DNA was digested with restriction
enzymes BamH I (lane 1), Pst I (lane 2), Pvu II (lane 3), Sma I (lane 4) and Xmn I (lane 5)
and analyzed by Southern hybridization using the PCR-amplified DNA fragment as a probe.
Sizes of DNA markers (35S-DNA Marker, Amersham) are indicated in kb.
2B. Two different positive clones (designated 2D11 and 4D11) isolated from an M.
tuberculosis cosmid library were digested with BamH I (lanes 1 and 4), Pvu II (lanes 2 and 5)
and Sma I (lanes 3 and 6) and hybridized with the PCR-generated sigma gene as a probe.

Fig. 3. Nucleotide and deduced amino acid sequences of the M. tuberculosis 
H37Rv algU gene. 

Fig. 4. Alignment of the inferred amino acid sequence of the M. tuberculosis
(Mt) H37Rv algU gene with sequences of the extracellular function family of sigma subunits from
other bacteria (Streptomyces coelicolor, Pseudomonas aeruginosa, Escherichia coli and
Hemophilus influenzae). Shading indicates identical amino acid residues. Amino acid sequence
alignments were performed using MegAlign (DNAStar).

Detailed Discussion of the Invention

All patents, patent applications and literature references cited herein are hereby 
incorporated in their entirety. In the case of inconsistencies, the present disclosure will 
prevail. 

The present invention is based on the isolation of a fragment of the M.
tuberculosis algU gene, encoding an alternative σ subunit of RNA polymerase. As described
in Example 1 below, PCR amplification of M. tuberculosis genomic DNA with primers
based on the M. leprae algU DNA sequence generated a DNA fragment of the expected size
(180 base pairs) (Figure 1). The PCR-amplified DNA had >90% identity to the M. leprae gene.
Southern blot analysis demonstrated the presence of a single copy of this gene in M.
tuberculosis (Figure 2A). The amplified DNA was utilized as a hybridization probe to recover
the entire algU gene from a cosmid library of genomic DNA from virulent M. tuberculosis
strain H37Rv. Nucleotide sequencing indicated that the 675 bp M. tuberculosis algU open
reading frame (ORF) encodes a protein of 24.3 kDa (225 amino acids) which shows significant
structural similarity to the σ subunits of diverse bacterial species, with greatest identity to the
stress-related extracellular function family of σ subunits of Streptomyces coelicolor,
Pseudomonas aeruginosa, Escherichia coli and Hemophilus influenzae. The sigma factors
from S. coelicolor, P. aeruginosa, E. coli and H. influenzae are 24%, 20%, 21% and 16%
identical to the M. tuberculosis sequence, respectively (Figure 4).
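
The relationship between the stated ORF length and protein size can be checked with simple arithmetic. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from the specification, so the result is only a rough estimate to set beside the stated 24.3 kDa. The function name is hypothetical.

```python
AVG_RESIDUE_DA = 110.0  # assumed rule-of-thumb average mass per amino acid residue


def orf_summary(orf_bp, avg_residue_da=AVG_RESIDUE_DA):
    """Return (number of codons, approximate mass in kDa) for an ORF length in bp."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = orf_bp // 3
    return codons, codons * avg_residue_da / 1000.0


codons, approx_kda = orf_summary(675)
print(codons, approx_kda)  # 675 bp corresponds to 225 codons
```

The estimate lands close to the stated value; the exact mass depends on the actual amino acid composition shown in Figure 3.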

The P. aeruginosa algU gene is part of a large operon that contains genes for
anti-sigma factors (mucA and mucB) and a protease (mucD) (Schurr et al.). Further
nucleotide sequencing and the availability of an integrated map of the genome of M. tuberculosis
H37Rv (Philipp et al., 1996) are expected to clarify the structural organization and position of
the algU locus of M. tuberculosis.

In practicing the present invention, many techniques in molecular biology,
microbiology, recombinant DNA, and protein biochemistry may be used, such as those explained
fully in, for example, Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, Second
Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York; DNA Cloning:
A Practical Approach, Volumes I and II, 1985 (D.N. Glover ed.); Oligonucleotide Synthesis,
1984 (M.J. Gait ed.); Transcription and Translation, 1984 (Hames and Higgins eds.); A
Practical Guide to Molecular Cloning; the series Methods in Enzymology (Academic Press,
Inc.); and Protein Purification: Principles and Practice, Second Edition (Springer-Verlag,
N.Y.).

The present invention encompasses nucleic acid sequences encoding the algU
gene of M. tuberculosis, enzymatically active fragments derived therefrom, and related
sequences. As used herein, a nucleic acid that is "derived from" a sequence refers to a nucleic
acid sequence that corresponds to a region of the sequence, sequences that are homologous or
complementary to the sequence, and "sequence-conservative variants" and
"function-conservative variants". Sequence-conservative variants are those in which a change of one or
more nucleotides in a given codon position results in no alteration in the amino acid encoded
at that position. Function-conservative variants are those in which a given amino acid residue
in the algU subunit has been changed without altering the overall conformation and function
of the polypeptide, including, but not limited to, replacement of an amino acid with one
having similar physico-chemical properties (such as, for example, acidic, basic, hydrophobic,
and the like). Fragments of the algU subunit that retain enzymatic activity can be identified
according to the methods described herein, e.g., expression in E. coli followed by enzymatic
assay of the cell extract.
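
The definition of a sequence-conservative variant can be illustrated by translating two coding sequences with the standard genetic code and comparing the results. This is a minimal sketch: the function names and the nine-nucleotide example sequences are hypothetical and are not taken from the algU sequence of Figure 3.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence (length a multiple of three) to protein."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_sequence_conservative(original, variant):
    """True when the variant encodes exactly the same amino acid sequence."""
    return translate(original) == translate(variant)


original = "ATGGAACTG"  # hypothetical fragment: Met-Glu-Leu
silent = "ATGGAGCTC"    # third-position changes only, still Met-Glu-Leu
changed = "ATGGATCTG"   # Glu replaced by Asp

print(is_sequence_conservative(original, silent))   # True
print(is_sequence_conservative(original, changed))  # False
```

The second variant replaces one acidic residue with another, so it is a candidate function-conservative variant in the sense defined above, which would have to be confirmed by an activity assay.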

The nucleic acids of the present invention include purine- and
pyrimidine-containing polymers of any length, either polyribonucleotides or polydeoxyribonucleotides or
mixed polyribo-polydeoxyribo nucleotides. This includes single- and double-stranded
molecules, i.e., DNA-DNA, DNA-RNA and RNA-RNA hybrids, as well as "protein nucleic
acids" (PNA) formed by conjugating bases to an amino acid backbone. This also includes
nucleic acids containing modified bases. The nucleic acids may be isolated directly from cells.
Alternatively, PCR can be used to produce the nucleic acids of the invention, using either
chemically synthesized strands or genomic material as templates. Primers used for PCR can
be synthesized using the sequence information provided herein and can further be designed to
introduce appropriate new restriction sites, if desirable, to facilitate incorporation into a given
vector for recombinant expression.
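
Designing a primer that introduces a new restriction site usually means adding the recognition sequence, plus a few extra bases, to the 5' end of the gene-specific sequence. The sketch below shows that idea under stated assumptions: the three-base clamp, the function names and the example sequences are hypothetical; only the recognition sequences for BamH I, EcoR I and Nde I are standard.

```python
# Recognition sequences of three commonly used restriction enzymes.
RESTRICTION_SITES = {
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "NdeI": "CATATG",
}


def add_site(gene_specific, enzyme, clamp="CGC"):
    """Prefix a primer with a 5' clamp and a restriction site (clamp is an assumption)."""
    return clamp + RESTRICTION_SITES[enzyme] + gene_specific.upper()


def site_count(sequence, enzyme):
    """Count occurrences of the site, to spot internal sites that would cut the insert."""
    return sequence.upper().count(RESTRICTION_SITES[enzyme])


primer = add_site("atgaccgag", "BamHI")  # hypothetical gene-specific 3' portion
print(primer)

insert = "ATGACCGAGGGATCCTTA"            # hypothetical insert with an internal BamH I site
print(site_count(insert, "BamHI"))
```

Checking the insert for internal copies of the chosen site before ordering primers avoids cutting the fragment during cloning.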

The nucleic acids of the present invention may be flanked by natural M.
tuberculosis regulatory sequences, or may be associated with heterologous sequences, including
promoters, enhancers, response elements, signal sequences, polyadenylation sequences, introns,
5'- and 3'-noncoding regions, and the like. The nucleic acids may also be modified by many
means known in the art. Non-limiting examples of such modifications include methylation,
"caps", substitution of one or more of the naturally occurring nucleotides with an analog, and
internucleotide modifications such as, for example, those with uncharged linkages (e.g., methyl
phosphonates, phosphotriesters, phosphoramidates, carbamates, etc.) and with charged
linkages (e.g., phosphorothioates, phosphorodithioates, etc.). Nucleic acids may contain one
or more additional covalently linked moieties, such as, for example, proteins (e.g., nucleases,
toxins, antibodies, signal peptides, poly-L-lysine, etc.), intercalators (e.g., acridine, psoralen,
etc.), chelators (e.g., metals, radioactive metals, iron, oxidative metals, etc.), and alkylators.
The nucleic acid may be derivatized by formation of a methyl or ethyl phosphotriester or an
alkyl phosphoramidate linkage. Furthermore, the nucleic acid sequences of the present
invention may also be modified with a label capable of providing a detectable signal, either
directly or indirectly. Exemplary labels include radioisotopes, fluorescent molecules, biotin,
and the like.

The invention also provides nucleic acid vectors comprising the disclosed algU
subunit sequences or derivatives or fragments thereof. A large number of vectors, including
plasmid and fungal vectors, have been described for replication and/or expression in a variety
of eukaryotic and prokaryotic hosts. Non-limiting examples include pKK plasmids (Clontech),
pUC plasmids, pET plasmids (Novagen, Inc., Madison, WI), or pRSET or pREP (Invitrogen,
San Diego, CA), and many appropriate host cells, using methods disclosed or cited herein or
otherwise known to those skilled in the relevant art. Recombinant cloning vectors will often
include one or more replication systems for cloning or expression, one or more markers for
selection in the host, e.g. antibiotic resistance, and one or more expression cassettes. Suitable
host cells may be transformed/transfected/infected as appropriate by any suitable method
including electroporation, CaCl2-mediated DNA uptake, fungal infection, microinjection,
microprojectile, or other established methods.

Appropriate host cells include bacteria, archaebacteria, fungi, especially yeast,
and plant and animal cells, especially mammalian cells. Of particular interest are E. coli, B.
subtilis, Saccharomyces cerevisiae, Saccharomyces carlsbergensis, Schizosaccharomyces
pombe, Sf9 cells, C129 cells, 293 cells, Neurospora, and CHO cells, COS cells, HeLa cells,
and immortalized mammalian myeloid and lymphoid cell lines. Preferred replication systems
include M13, ColE1, SV40, baculovirus, lambda, adenovirus, and the like. A large number
of transcription initiation and termination regulatory regions have been isolated and shown to
be effective in the transcription and translation of heterologous proteins in the various hosts.
Examples of these regions, methods of isolation, manner of manipulation, etc. are known in
the art. Under appropriate expression conditions, host cells can be used as a source of
recombinantly produced mycobacterial-derived peptides and polypeptides.

Advantageously, vectors may also include a transcription regulatory element
(i.e., a promoter) operably linked to the algU subunit portion. The promoter may optionally
contain operator portions and/or ribosome binding sites. Non-limiting examples of bacterial
promoters compatible with E. coli include: trc promoter; β-lactamase (penicillinase) promoter;
lactose promoter; tryptophan (trp) promoter; arabinose BAD operon promoter; lambda-derived
PL promoter and N gene ribosome binding site; and the hybrid tac promoter derived from
sequences of the trp and lac UV5 promoters. Non-limiting examples of yeast promoters
include 3-phosphoglycerate kinase promoter, glyceraldehyde-3-phosphate dehydrogenase
(GAPDH) promoter, galactokinase (GAL1) promoter, galactoepimerase promoter, and alcohol
dehydrogenase (ADH) promoter. Suitable promoters for mammalian cells include without
limitation viral promoters such as that from Simian Virus 40 (SV40), Rous sarcoma virus
(RSV), adenovirus (ADV), and bovine papilloma virus (BPV). Mammalian cells may also
require terminator sequences and poly A addition sequences, and enhancer sequences which
increase expression may also be included. Sequences which cause amplification of the gene
may also be desirable. Furthermore, sequences that facilitate secretion of the recombinant
product from cells, including, but not limited to, bacteria, yeast, and animal cells, such as
secretory signal sequences and/or prohormone pro region sequences, may also be included.

Nucleic acids encoding wild-type or variant subunit polypeptides may also be
introduced into cells by recombination events. For example, such a sequence can be
introduced into a cell, and thereby effect homologous recombination at the site of an
endogenous gene or a sequence with substantial identity to the gene. Other
recombination-based methods, such as non-homologous recombinations or deletion of endogenous genes by
homologous recombination, may also be used.

algU subunit-derived polypeptides according to the present invention, including
function-conservative variants, may be isolated from wild-type or mutant M. tuberculosis cells,
or from heterologous organisms or cells (including, but not limited to, bacteria, fungi, insect,
plant, and mammalian cells) into which a subunit-derived protein-coding sequence has been
introduced and expressed. Furthermore, the polypeptides may be part of recombinant fusion




proteins. Alternatively, polypeptides may be chemically synthesized by commercially available 
automated procedures, including, without limitation, exclusive solid phase synthesis, partial 
solid phase methods, fragment condensation or classical solution synthesis. 

"Purification" of a σ subunit polypeptide refers to the isolation of the
polypeptide in a form that allows its enzymatic activity to be measured without interference
by other components of the cell in which the polypeptide is expressed. Methods for
polypeptide purification are well-known in the art, including, without limitation, preparative
disc-gel electrophoresis, isoelectric focusing, HPLC, reversed-phase HPLC, gel filtration, ion
exchange and partition chromatography, and countercurrent distribution. For some purposes,
it is preferable to produce the polypeptide in a recombinant system in which the protein
contains an additional sequence tag that facilitates purification, such as, but not limited to, a
polyhistidine sequence. The polypeptide can then be purified from a crude lysate of the host
cell by chromatography on an appropriate solid-phase matrix. Alternatively, antibodies
produced against the σ subunit or against peptides derived therefrom can be used as
purification reagents. Other purification methods are possible.

The isolated polypeptides may be modified by, for example, phosphorylation, 
sulfation, acylation, or other protein modifications. They may also be modified with a label 
capable of providing a detectable signal, either directly or indirectly, including, but not limited 
to, radioisotopes and fluorescent compounds. 

Screening Methods to Identify Anti-tuberculosis Agents

The methods and compositions of the present invention can be used to identify
compounds that inhibit the function of M. tuberculosis RNA polymerase and thus are useful
as anti-tuberculosis agents. This is achieved by providing active recombinant algU subunit
according to the present invention, in combination with other components of RNA polymerase,
in a context in which the inhibitory effects of test compounds can be measured.

In a preferred embodiment, recombinant M. tuberculosis RNA polymerase
subunits (α, β, β′ plus the σ subunit disclosed herein) are purified in milligram quantities
from E. coli cultures by affinity methods utilizing hexahistidine-tagged α and σ subunits.
Enzymatically active holoenzyme is reconstituted using these components. The active
polymerase is then incubated in the presence of test compounds to form test mixtures, and in
the absence of test compounds to form control mixtures. In vitro transcription is then carried
out using a DNA template containing appropriate promoter and reporter sequences. (See
Example 3 below.)

In another embodiment, M. tuberculosis RNA polymerase subunits (α, β, β′
plus the σ subunit disclosed herein) are co-expressed in E. coli or another surrogate bacterial
cell, in conjunction with an appropriate promoter-reporter gene. The ability of test compounds
to differentially inhibit M. tuberculosis RNA polymerase is then assessed.

M. tuberculosis promoters useful in practicing the invention include without
limitation: hsp60 promoter (Stover et al., 1991); cpn-60 promoter (Kong et al., 1993); 85A
antigen promoter (Kremer, 1995); PAN promoter (Murray et al., 1992); 16S RNA promoter
(Ji et al., 1994); and askp promoter (Cirillo et al., 1994). Useful reporter genes include
without limitation xylE (Curcic et al., 1994); CAT (Das Gupta et al., 1993); luciferase
(Cooksey et al., 1993); green fluorescent protein (Dhadayuthap et al., 1995); and lacZ (Silhavy
et al., 1985).

It will be understood that the present invention encompasses M. tuberculosis
RNA polymerases containing the algU σ factor disclosed herein, which is used in conjunction
with particular promoters that are recognized by RNA polymerase containing this σ factor.
The invention also encompasses the identification of additional promoters that are recognized
by the particular σ subunit of the present invention. This is achieved by providing a library
of random M. tuberculosis gene fragments cloned upstream of an appropriate reporter gene
(see above). The library is transformed into M. tuberculosis or M. smegmatis and reporter
gene expression is measured. Alternatively, the library is transformed into another bacterial
cell, such as, e.g., E. coli, which expresses M. tuberculosis RNA polymerase core subunits as
well as the σ subunit of the present invention and cognate promoters that drive reporter gene
expression. In yet another embodiment, expression of an M. tuberculosis σ factor confers new
recognition properties on E. coli RNA polymerase and permits isolation of promoters utilized
specifically by a particular M. tuberculosis σ subunit.

Preferably, both in vitro and in vivo screening methods of the present invention
are adapted to a high-throughput format, allowing a multiplicity of compounds to be tested in
a single assay. Such inhibitory compounds may be found in, for example, natural product
libraries, fermentation libraries (encompassing plants and microorganisms), combinatorial
libraries, compound files, and synthetic compound libraries. For example, synthetic
compound libraries are commercially available from Maybridge Chemical Co. (Trevillet,
Cornwall, UK), Comgenex (Princeton, NJ), Brandon Associates (Merrimack, NH), and
Microsource (New Milford, CT). A rare chemical library is available from Aldrich Chemical
Company, Inc. (Milwaukee, WI). Alternatively, libraries of natural compounds in the form
of bacterial, fungal, plant and animal extracts are available from, for example, Pan
Laboratories (Bothell, WA) or MycoSearch (NC), or are readily producible. Additionally,
natural and synthetically produced libraries and compounds are readily modified through
conventional chemical, physical, and biochemical means (Blondelle et al., TibTech 14:60,
1996). Screening is preferably carried out using automated equipment, to allow for the
simultaneous screening of a multiplicity of test compounds.

Useful anti-tuberculosis compounds are identified as those test compounds that
decrease tuberculosis-specific transcription. Once a compound has been identified by the
methods of the present invention as an RNA polymerase inhibitor, in vivo and in vitro tests
may be performed to further characterize the nature and mechanism of the inhibitory activity.
For example, classical enzyme kinetic plots can be used to distinguish, e.g., competitive and
non-competitive inhibitors.
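
One classical kinetic plot is the double-reciprocal (Lineweaver-Burk) plot, in which 1/v is plotted against 1/[S]; a competitive inhibitor raises the apparent Km and leaves Vmax unchanged, whereas a pure non-competitive inhibitor lowers the apparent Vmax and leaves Km unchanged. The sketch below fits such a plot by least squares and applies that rule. It assumes simple Michaelis-Menten behaviour, uses synthetic noise-free rates, and every name, value and tolerance in it is hypothetical rather than taken from the specification.

```python
def lineweaver_burk(substrate, rates):
    """Estimate (Km, Vmax) from a least-squares line through 1/v versus 1/[S]."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x  # intercept equals 1/Vmax
    vmax = 1.0 / intercept
    return slope * vmax, vmax            # slope equals Km/Vmax


def classify(km0, vmax0, km_i, vmax_i, tol=0.1):
    """Compare apparent constants with and without inhibitor (tol is an assumed cutoff)."""
    km_changed = abs(km_i - km0) / km0 > tol
    vmax_changed = abs(vmax_i - vmax0) / vmax0 > tol
    if km_changed and not vmax_changed:
        return "competitive"
    if vmax_changed and not km_changed:
        return "non-competitive"
    if km_changed and vmax_changed:
        return "mixed or uncompetitive"
    return "no inhibition detected"


def michaelis_menten(substrate, km, vmax):
    """Generate synthetic initial rates for illustration."""
    return [vmax * s / (km + s) for s in substrate]


substrate = [2.0, 5.0, 10.0, 20.0, 50.0]           # hypothetical concentrations
km0, vmax0 = lineweaver_burk(substrate, michaelis_menten(substrate, 10.0, 100.0))
km_c, vmax_c = lineweaver_burk(substrate, michaelis_menten(substrate, 20.0, 100.0))
km_n, vmax_n = lineweaver_burk(substrate, michaelis_menten(substrate, 10.0, 50.0))

print(classify(km0, vmax0, km_c, vmax_c))  # competitive
print(classify(km0, vmax0, km_n, vmax_n))  # non-competitive
```

With real data, a direct non-linear fit is generally more robust than the double-reciprocal transformation, which amplifies error at low substrate concentrations.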

Compounds identified as RNA polymerase inhibitors using the methods of the
present invention may be modified to enhance potency, efficacy, uptake, stability, and
suitability for use in pharmaceutical formulations, etc. These modifications are achieved and
tested using methods well-known in the art.

The present invention is further described in the following examples which are
intended to further describe the invention without limiting the scope thereof.

Example 1 

In the present Example, the following Materials and Methods were used.
PCR amplification: Based on the M. leprae cosmid sequence (cosmid B-1620,
Genbank accession #U-00015, position 36121-35942), a set of primers was designed. The
sequences of these primers were: 5'-ATGAACGAACTGCTCGAGATCTTGCCTGCC-3' (P1)
and 5'-TCACCCGCCGCGACGATCTCGGACGTCAAC-3' (P2). Amplification was performed
on 100 ng of M. tuberculosis H37Rv genomic DNA using a programmable thermal
controller (PTC100, MJ Research, Inc.). The PCR conditions were as follows: reaction
volume 100 µl; Pfu cloned DNA polymerase (Stratagene); 0.2 mM dNTPs
(Boehringer-Mannheim); 100 ng of primer; one cycle of 94°C for 1 minute, thirty cycles of 94°C for one
minute, 50°C for one minute and 72°C for one minute.
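
The primer set and the target coordinates can be sanity-checked with a few lines of code. The sketch below uses the two primer sequences given above (with extraction spacing removed) and the stated cosmid coordinates; the helper names are hypothetical, and the calculation covers only length and G+C content, not melting temperature.

```python
P1 = "ATGAACGAACTGCTCGAGATCTTGCCTGCC"
P2 = "TCACCCGCCGCGACGATCTCGGACGTCAAC"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_percent(seq):
    """Percentage of G and C bases in a sequence."""
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon_length(first, last):
    """Inclusive length of the region between two coordinates, in either orientation."""
    return abs(first - last) + 1


print(len(P1), round(gc_percent(P1), 1))
print(len(P2), round(gc_percent(P2), 1))
print(amplicon_length(36121, 35942))  # coordinates from the cosmid sequence
print(reverse_complement(P2))         # P2 as it reads on the coding strand
```

The inclusive span of positions 36121-35942 agrees with the 180 base pair product reported for this amplification.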

Southern blot analysis: Restriction enzyme digests of M. tuberculosis H37Rv
chromosomal DNAs were electrophoresed on 1% TAE-agarose gels and transferred to Nytran
membranes (Schleicher and Schuell) using a Pressure Blotter (Stratagene). Probe labeling was
performed using the rediprime DNA labelling system (Amersham) essentially as described by
the supplier. Hybridization was performed using 6x SSC, 5x Denhardt's solution, 0.5% SDS, 0.1
mg per ml salmon sperm DNA and 50% formamide. Washing was performed using 2x SSC,
0.5% SDS at room temperature for 15 min and 0.1x SSC, 0.5% SDS at 37°C for 15 min.

Cosmid hybridizations: A transducing lysate of a cosmid library of M.
tuberculosis H37Rv genomic DNA in vector pYA3060 was generously supplied by Dr. J.
Clark-Curtiss. Cosmid-bearing E. coli χ2819T (Jacobs et al., 1986) colonies representing
roughly five genomic equivalents were individually picked to wells of sterile 96-well
microtiter dishes and propagated at 30°C in Luria broth containing ampicillin at 30 µg/ml and
thymidine at 50 µg/ml. Colonies were grown overnight at room temperature on the above
media as nylon filter replicas of the library. Filters were processed for colony hybridization
by standard methods and probe hybridizations were performed as described above. Cosmid DNAs
were purified using maxiprep columns (Qiagen).

DNA sequencing and analysis: Plasmid templates for nucleotide sequencing
were purified using maxiprep columns (Qiagen). PCR cycle sequencing (ABI Prism) was
carried out with an Applied Biosystems automated sequencer at the Massachusetts General
Hospital DNA Sequencing Core Facility, Department of Molecular Biology (Boston, MA).

(a) Cloning of M. tuberculosis algU gene

A DNA fragment (180 base pairs) that contains part of the M. tuberculosis algU gene
was identified by PCR amplification of M. tuberculosis H37Rv genomic DNA with
primers that were derived from the M. leprae cosmid sequence. To determine whether the
amplified DNA fragment contains the algU gene, the 180 base pair DNA fragment was
subcloned into a pCRScript (Stratagene) plasmid and its nucleotide sequence was determined.
The deduced amino acid sequence of the PCR product showed significant homology to the
algU sequence from other bacteria (Figure 4).

(b) Southern blot analysis and isolation of the full-length M. tuberculosis algU gene

To determine whether the cloned algU gene is a single-copy gene in M. tuberculosis,
Southern blot analysis was performed. The PCR-cloned DNA fragment was used as a probe
to analyze M. tuberculosis H37Rv genomic DNA that had been digested with endonucleases.
The PCR-cloned DNA probe recognized a single band in each digested chromosomal DNA
(Figure 2), and it was concluded that the algU gene is a single-copy gene in M.
tuberculosis.

The full-length algU gene was obtained from a cosmid library of M. 
tuberculosis H37Rv genomic fragments (kindly provided by Dr. J. Clark-Curtiss) using the 180 
bp fragment as a probe. Screening of 552 cosmid-bearing E. coli colonies (representing roughly 5 
genome equivalents) with the algU gene fragment yielded 5 positive clones. One algU-hybridizing 
cosmid clone, 4D11, was analyzed, and Southern blotting of 4D11 DNA digested 
with a panel of restriction enzymes confirmed that no gross structural rearrangements of 
the algU gene had occurred during cloning (Fig. 3). The 1.1 kb BamHI, 1.2 kb PvuII and 1 
kb SmaI algU-hybridizing fragments of cosmid 4D11 were subcloned into vector pSKII+ 
prior to nucleotide sequencing. 
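
The screening numbers above can be checked with a short coverage calculation. The following Python sketch is illustrative only: the genome size (about 4.4 Mb for H37Rv) and the average cosmid insert size (40 kb) are assumptions that are not stated in this text, and the function names are hypothetical.

```python
# Illustrative library-coverage arithmetic for a cosmid screen.
# GENOME_BP and INSERT_BP are assumptions, not values given in the text.

GENOME_BP = 4_400_000   # approximate M. tuberculosis H37Rv genome size (assumed)
INSERT_BP = 40_000      # typical cosmid insert size (assumed)
N_CLONES = 552          # number of colonies screened (from the text)


def genome_equivalents(n_clones: int, insert_bp: int, genome_bp: int) -> float:
    """Total cloned DNA expressed as multiples of the genome size."""
    return n_clones * insert_bp / genome_bp


def prob_locus_present(n_clones: int, insert_bp: int, genome_bp: int) -> float:
    """Probability that a given locus is in at least one clone,
    using P = 1 - (1 - f)^N with f = insert size / genome size."""
    f = insert_bp / genome_bp
    return 1.0 - (1.0 - f) ** n_clones


coverage = genome_equivalents(N_CLONES, INSERT_BP, GENOME_BP)
p_present = prob_locus_present(N_CLONES, INSERT_BP, GENOME_BP)

print(f"coverage: {coverage:.2f} genome equivalents")
print(f"probability that a single-copy locus is represented: {p_present:.4f}")
```

Under these assumed sizes the library corresponds to about five genome equivalents, so a single-copy gene would be expected in roughly five clones, which is in line with the five positive clones reported.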

(c) Sequence analysis of the M. tuberculosis algU gene 

Nucleotide sequencing was performed on plasmid subclones shown in Figure 
4. The sequence encodes a 675 bp ORF which has an overall G+C composition of 63% (85% 
for bases occupying the codon third position). Assuming that the ATG at position 53-55 
serves as the initiator codon, the ORF is expected to encode a protein of 225 amino acids. A 
strong match with the consensus sequence for an M. tuberculosis ribosome binding site 
(CAGGTG) (Novick, 1996) is positioned just upstream of the putative ATG codon. 
Examination of more than 63 bp of nucleotide sequence upstream of the translation start site 
did not reveal regions of exact identity with prokaryotic promoter sites. Among σ subunits 
studied in other bacterial species, the deduced amino acid sequence of the 225 residue M. 
tuberculosis protein displayed greatest similarity to the stress-related extracellular function 
family of sigma subunits of Streptomyces coelicolor, Pseudomonas aeruginosa, Escherichia 
coli and Haemophilus influenzae (Fig. 4). 
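
The composition figures quoted above (overall G+C and G+C at the codon third position) can be computed for any in-frame ORF as sketched below. This is a minimal illustration: the function names are hypothetical, and the short test sequence is a made-up toy ORF, not the algU sequence.

```python
# Minimal sketch of G+C and third-position G+C calculations for an ORF.
# TOY_ORF is a hypothetical sequence used only to exercise the functions.


def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def gc3_fraction(orf: str) -> float:
    """G+C fraction at the third position of each codon of an in-frame ORF."""
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return gc_fraction(orf[2::3])  # every third base, starting at codon position 3


def codon_count(orf_length_bp: int) -> int:
    """Number of complete codons in an ORF of the given length."""
    return orf_length_bp // 3


TOY_ORF = "ATGGCCGACGTGTAA"  # hypothetical 5-codon ORF

print(f"G+C: {gc_fraction(TOY_ORF):.2%}")
print(f"G+C at codon position 3: {gc3_fraction(TOY_ORF):.2%}")
print(f"codons in a 675 bp ORF: {codon_count(675)}")
```

A third-position G+C value well above the overall value, as reported for this ORF, is the pattern expected for coding sequence in a G+C-rich genome.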

Example 2 

Others have shown that overexpressed E. coli RNA polymerase subunits can 
be reconstituted into an enzymatically active protein (Zalenskaya et al., 1990; Kashlev et al., 
1993; Tang et al., 1995). The M. tuberculosis rpoA (Healy et al.), rpoB and rpoC genes 
(Miller et al., 1994) have been cloned and characterized. Using the overexpressed M. 
tuberculosis RNA polymerase subunits, the in vitro reconstitution assay to form the 
enzymatically active core enzyme will be performed. Holoenzyme that contains the algU sigma 
subunit can then be obtained, and gene regulation in M. tuberculosis will be studied 
biochemically. Transcription inhibitors that act against the holoenzyme that contains the 
stress-related sigma factor will be identified. 

Example 3 

High Throughput Screens For Inhibitors of M. tuberculosis RNA Polymerase and σ Subunit 

High-throughput screens for anti-tuberculosis agents may be performed using 
either an in vitro or in vivo format. In either case, the ability of test compounds to inhibit M. 
tuberculosis RNA polymerase-driven transcription of M. tuberculosis promoters is tested. 

The algU sigma factor of the present invention regulates transcription of 
promoters characterized by the sigma promoter consensus sequence: 
GAACTT-(N16/17)-TCTgA-N(1-5) (Deretic, et al., 1994; Erickson, et al., 1989; Lipinska, et al., 1988; Martin, et 
al., 1994; Scharr, et al., 1995). Therefore, this promoter is preferred for use herein. 
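
As an illustration of how candidate template sequences could be checked against this consensus, the following Python sketch scans a DNA string for GAACTT, a 16 or 17 base spacer, and TCTGA. It rests on assumptions: the lowercase g is treated as a required G, the trailing N(1-5) is ignored, and only the given strand is searched. The function name and the test sequence are hypothetical.

```python
import re

# GAACTT, then a 16- or 17-base spacer, then TCTGA.
# Assumptions: lowercase "g" in the consensus is treated as a required G,
# and the trailing N(1-5) segment of the consensus is not modeled.
CONSENSUS = re.compile(r"GAACTT[ACGT]{16,17}TCTGA")


def find_consensus_sites(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in CONSENSUS.finditer(seq.upper())]


# Hypothetical test sequence with a 16-base spacer.
toy = "CC" + "GAACTT" + "A" * 16 + "TCTGA" + "GG"
sites = find_consensus_sites(toy)
print(sites)
```

A real search would also scan the reverse complement and might tolerate mismatches at weakly conserved positions; both are omitted here for brevity.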

a) In vitro screens: 

The following procedure is used for cell-free high-throughput screening. A 
Tomtec Quadra 96-well pipetting station is used to add the reaction components to 
polypropylene 96-well dishes. 5 µl aliquots of test compounds dissolved in DMSO (or DMSO 
alone as a control) are added to wells. This is followed by 20 µl of the RNA polymerase 
mixture, which consists of: 10 mM DTT, 200 mM KCl, 10 mM Mg2+, 1.5 µM bovine serum 
albumin, and 0.25 µg reconstituted RNA polymerase. After allowing the test compound to 
interact with the RNA polymerase, 25 µl of the DNA/NTP mixture is added, containing: 1 µg 
template DNA (see above), 4 µM [α-32P]-UTP, and 400 µM each CTP, ATP, and GTP. 
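
The three additions give a 50 µl reaction (5 + 20 + 25 µl). The sketch below works out the resulting dilution of each component. It assumes that the concentrations listed above are those of the individual mixtures before they are combined, which the text does not state explicitly; if they are instead final concentrations, no correction is needed. The dictionary and function names are hypothetical.

```python
# Dilution arithmetic for the 50 µl in vitro transcription reaction.
# Assumption: listed concentrations refer to each mixture BEFORE combining.

COMPOUND_UL = 5.0     # test compound in DMSO
POLYMERASE_UL = 20.0  # RNA polymerase mixture
DNA_NTP_UL = 25.0     # DNA/NTP mixture
TOTAL_UL = COMPOUND_UL + POLYMERASE_UL + DNA_NTP_UL


def final_concentration(mix_conc: float, added_ul: float, total_ul: float) -> float:
    """Concentration after diluting added_ul of a mixture into total_ul."""
    return mix_conc * added_ul / total_ul


polymerase_mix_mM = {"DTT": 10.0, "KCl": 200.0, "Mg2+": 10.0}
ntp_mix_uM = {"[alpha-32P]-UTP": 4.0, "CTP": 400.0, "ATP": 400.0, "GTP": 400.0}

final_mM = {k: final_concentration(v, POLYMERASE_UL, TOTAL_UL)
            for k, v in polymerase_mix_mM.items()}
final_uM = {k: final_concentration(v, DNA_NTP_UL, TOTAL_UL)
            for k, v in ntp_mix_uM.items()}
dmso_percent = 100.0 * COMPOUND_UL / TOTAL_UL  # v/v, if the compound stock is neat DMSO

print(final_mM)
print(final_uM)
print(f"DMSO: {dmso_percent:.0f}% v/v")
```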

After incubation for 30 min at 25°C, the reaction is stopped by addition of 150 
µl 10% trichloroacetic acid (TCA). After incubation at room temperature for 60 min, the 
TCA-precipitated RNA is adsorbed onto double-thick glass fiber filtermats using a Tomtec cell 
harvester. The wells of the microtiter plate and the filter are washed twice with 5% TCA and 
bound radioactivity is determined using a Wallac microbeta 1450 scintillation counter. 

Inhibitory activity due to the test compound is calculated according to the 
formula: 

$$\%\ \text{inhibition} = \frac{\text{cpm}_{\text{positive control}} - \text{cpm}_{\text{sample}}}{\text{cpm}_{\text{positive control}}} \times 100$$

where $\text{cpm}_{\text{positive control}}$ represents the average of the cpm in wells that received DMSO alone, 
and $\text{cpm}_{\text{sample}}$ represents the cpm in the well that received test compound. Compounds that 
cause at least 50% inhibition are scored as positive "hits" in this assay. 
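
The calculation and the 50% hit threshold can be expressed as a short Python sketch. The function names and the example counts are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean


def percent_inhibition(cpm_sample: float, control_cpms: list[float]) -> float:
    """% inhibition relative to the mean cpm of the DMSO-only control wells."""
    cpm_control = mean(control_cpms)
    if cpm_control <= 0:
        raise ValueError("control cpm must be positive")
    return (cpm_control - cpm_sample) / cpm_control * 100.0


def is_hit(cpm_sample: float, control_cpms: list[float],
           threshold: float = 50.0) -> bool:
    """A compound is scored as a hit if it causes at least `threshold` % inhibition."""
    return percent_inhibition(cpm_sample, control_cpms) >= threshold


# Hypothetical counts: three DMSO-only wells and two test wells.
controls = [10000.0, 9800.0, 10200.0]
strong = percent_inhibition(4000.0, controls)
weak = percent_inhibition(7000.0, controls)
print(f"{strong:.1f}% -> hit: {is_hit(4000.0, controls)}")
print(f"{weak:.1f}% -> hit: {is_hit(7000.0, controls)}")
```

Averaging the control wells before computing the ratio follows the definition given above; a plate-level analysis might additionally subtract background counts, which the text does not describe.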

As an additional control, rifampicin is used at a concentration of 30 nM, which 
results in a 50-75% inhibition of transcription in this assay. 



b) In vivo screen: 

M. tuberculosis RNA polymerase subunits (α, β, β', and the σ subunit disclosed 
herein) are expressed in E. coli under the control of regulatable promoters by transforming E. 
coli with appropriate plasmids. If the σ subunit is expressed, a DNA sequence comprising the 
sigE promoter described above is also introduced into the cells to serve as a template for M. 
tuberculosis-specific transcription. 

In one embodiment, the sigE promoter sequence is linked to a DNA sequence 
encoding the xylE gene product, catechol 2,3-dioxygenase (CDO). When expressed in the 
E. coli cell, CDO converts catechol to 2-hydroxymuconic semialdehyde, which has a bright 
yellow color (having an absorbance maximum at 375 nm) that is easily detected in whole cells 
or in crude extracts. The substrate for this enzyme is a small aromatic molecule that easily 
enters the bacterial cytoplasm and does not adversely affect cell viability. 

In a high-throughput format, aliquots of bacterial cultures are incubated in the 
absence or presence of test compounds, and CDO activity is monitored by measuring 
absorbance at 375 nm following addition of catechol. 


c) Specificity: 

Compounds that score as positive in either the in vitro or in vivo assays 
described above are then tested for their effect on human RNA polymerase II. Those 
compounds which do not significantly inhibit human RNA polymerase II will be further 
developed as potential anti-tuberculosis agents. 

Claims: 



1. An isolated, purified DNA encoding M. tuberculosis RNA 
polymerase algU σ subunit. 

2. A DNA as defined in claim 1, wherein said DNA has a sequence 
selected from the group consisting of the sequence shown in Figure 3, sequence-conservative 
variants thereof, and function-conservative variants thereof. 

3. A DNA vector comprising the nucleic acid sequence of claim 2 
operably linked to a transcription regulatory element. 

4. A cell comprising a DNA vector as defined in claim 3, wherein said 
cell is selected from the group consisting of bacterial, fungal, plant, insect, and mammalian 
cells. 

5. A cell as defined in claim 4, wherein said cell is a bacterial cell. 

6. An isolated purified polypeptide comprising a polypeptide encoded 
by a DNA as defined in claim 2. 

7. A method for high-throughput screening to detect inhibitors of M. 
tuberculosis RNA polymerase, said method comprising: 
    a) providing a mixture comprising 
        (i) purified M. tuberculosis RNA polymerase containing the algU 
        subunit, 
        (ii) a DNA template encoding a promoter sequence that is 
        recognized by M. tuberculosis RNA polymerase containing said algU subunit; 
    b) incubating said mixture in the presence of test compounds to form 
    test samples, and in the absence of test compounds to form control samples, wherein said 
    incubating is performed under conditions that result in RNA synthesis in the control 
    samples; 
    c) measuring RNA synthesis in said test and control samples; and 
    d) comparing the RNA synthesis detected in step (c) between said test 
    and control samples; 
wherein an RNA polymerase inhibitor is a test compound that causes a 
reduction in RNA synthesis measured in said test sample relative to RNA synthesis 
measured in said control sample. 

8. A method for high-throughput screening to detect inhibitors of M. 
tuberculosis RNA polymerase, said method comprising: 
    a) providing a non-mycobacterial bacterial strain that 
        (i) has been transformed with a DNA template encoding a 
        promoter sequence that is recognized by M. tuberculosis RNA polymerase containing the 
        algU subunit, and 
        (ii) expresses enzymatically active M. tuberculosis RNA 
        polymerase, wherein said polymerase comprises α, β, β', and the algU σ subunits; 
    b) incubating the bacterial strain of (a) in the presence of test 
    compounds to form test samples, and in the absence of test compounds to form control 
    samples; 
    c) measuring RNA synthesis in the test and control samples; and 
    d) comparing the RNA synthesis detected in step (c) between the test 
    and control samples; 
wherein an RNA polymerase inhibitor is a test compound that causes a 
reduction in RNA synthesis measured in said test sample relative to RNA synthesis 
measured in said control sample. 

FIG. 2A 